Skip to content

Conversation

@pmachapman
Copy link
Collaborator

@pmachapman pmachapman commented Oct 29, 2025

Fixes #765 and #784.

This is not a complete refactor of all of the tests, but I have simplified use of Paratext projects in e2e tests, which should make future tests easier. I've also updated the test data to be public domain data for both Paratext and text files.


This change is Reviewable

Copy link
Collaborator

@Enkidu93 Enkidu93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for taking a while to get to this! Thank you for doing this, Peter!

@Enkidu93 reviewed 52 of 52 files at r1, all commit messages.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @ddaspit)


src/Serval/test/Serval.E2ETests/data/TestProject/grc.ldml line 0 at r1 (raw file):
Do we need to check this in?


src/Serval/test/Serval.E2ETests/ServalClientHelper.cs line 593 at r1 (raw file):

        var sourceFileConfig = new List<CorpusFileConfig>();

        if (sourceLanguage == targetLanguage && !inference)

This logic always felt a little awkward to me. At a minimum, since this is used a few times, could we move it to a method?


src/Serval/test/Serval.E2ETests/data/sbp_usfm/sbp.ldml line 0 at r1 (raw file):
Do we need to check this in for our purposes? Or can we just remove it?


src/Serval/test/Serval.E2ETests/ServalApiTests.cs line 221 at r1 (raw file):

            }
        ];
        await StartAndCancelTwiceAsync(engineId);

Could we just do the StartAndCancelTwice() as part of another test and then go on to run it as usual? It probably wouldn't save a ton of time, but it would be something.


src/Serval/test/Serval.E2ETests/ServalApiTests.cs line 225 at r1 (raw file):

    [Test]
    public async Task Nmt_LargeBatchAndDownload()

Could this be combined with Nmt_Batch?


src/Serval/test/Serval.E2ETests/ServalApiTests.cs line 504 at r1 (raw file):

        await _helperClient.AddParatextCorpusToEngineAsync(engineId, "es", "en", false);

        await StartAndCancelTwiceAsync(engineId);

See comment above.


src/Serval/test/Serval.E2ETests/data/en_usfm/en.ldml line 1 at r1 (raw file):

<ldml><identity><version number="" /><generation date="2018-04-03T20:44:50Z" /><language type="en" /><special xmlns:sil="urn://www.sil.org/ldml/0.1"><sil:identity windowsLCID="1033" /></special></identity><characters><exemplarCharacters>[A-Za-z]</exemplarCharacters><exemplarCharacters type="punctuation">[!'-),-.\:;?\[\]\u00B4\u200C\u200D\u2014\u2018\u2019\u201C\u201D]</exemplarCharacters><special xmlns:sil="urn://www.sil.org/ldml/0.1"><sil:exemplarCharacters type="wordFormingPunctuation">['\-\u00B4\u2014]</sil:exemplarCharacters><sil:exemplarCharacters type="footnotes">[]</sil:exemplarCharacters><sil:exemplarCharacters type="crossrefs">[]</sil:exemplarCharacters><sil:exemplarCharacters type="verseSegments">[a b c d e f g h i j k l m n o p q r s t u v w x y z {aa} {bb} {cc} {dd} {ee} {ff} {gg} {hh} {ii} {jj} {kk} {ll} {mm} {nn} {oo} {pp} {qq} {rr} {ss} {tt} {uu} {vv} {ww} {xx} {yy} {zz}]</sil:exemplarCharacters><sil:exemplarCharacters type="diacritics">[]</sil:exemplarCharacters><sil:exemplarCharacters type="wordBreaks">[]</sil:exemplarCharacters></special></characters><delimiters><quotationStart>“</quotationStart><quotationEnd>”</quotationEnd><alternateQuotationStart>‘</alternateQuotationStart><alternateQuotationEnd>’</alternateQuotationEnd><special xmlns:sil="urn://www.sil.org/ldml/0.1"><sil:matched-pairs><sil:matched-pair open="(" close=")" paraClose="false" /><sil:matched-pair open="[" close="]" paraClose="false" /><sil:matched-pair open="{" close="}" paraClose="false" /></sil:matched-pairs><sil:punctuation-patterns><sil:punctuation-pattern pattern="._" context="final" /><sil:punctuation-pattern pattern="!_" context="final" /><sil:punctuation-pattern pattern="?_" context="final" /><sil:punctuation-pattern pattern="-" context="medial" /><sil:punctuation-pattern pattern="'" context="medial" /><sil:punctuation-pattern pattern="´" context="medial" /><sil:punctuation-pattern pattern="—" context="medial" /><sil:punctuation-pattern pattern="‌" context="medial" /><sil:punctuation-pattern pattern="‍" context="medial" /></sil:punctuation-patterns><sil:quotation-marks paraContinueType="all"><sil:quotationContinue></sil:quotationContinue><sil:quotation open="“" close="”" level="3" /></sil:quotation-marks></special></delimiters><layout><orientation><characterOrder>left-to-right</characterOrder></orientation></layout><collations><defaultCollation>standard</defaultCollation><collation type="standard"><cr><![CDATA[&[before 1] [first regular] < a\/A < b\/B < c\/C < d\/D < e\/E < f\/F < g\/G < h\/H < i\/I < j\/J < k\/K < l\/L < m\/M < n\/N < o\/O < p\/P < q\/Q < r\/R < s\/S < t\/T < u\/U < v\/V < w\/W < x\/X < y\/Y < z\/Z]]></cr><special xmlns:sil="urn://www.sil.org/ldml/0.1"><sil:simple><![CDATA[a/A

Do we need to check this in?


src/Serval/test/Serval.E2ETests/data/en_usfm/ProjectProgress.xml line 1 at r1 (raw file):

<?xml version="1.0" encoding="utf-8"?>

Same with these: Can we get away with not adding these to the repo?

Copy link
Collaborator Author

@pmachapman pmachapman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 40 of 52 files reviewed, 8 unresolved discussions (waiting on @ddaspit and @Enkidu93)


src/Serval/test/Serval.E2ETests/ServalApiTests.cs line 221 at r1 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

Could we just do the StartAndCancelTwice() as part of another test and then go on to run it as usual? It probably wouldn't save a ton of time, but it would be something.

Done.


src/Serval/test/Serval.E2ETests/ServalApiTests.cs line 225 at r1 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

Could this be combined with Nmt_Batch?

I think this test is too different, as it uses a quite different training corpus, and it persists the model, unlike Nmt_Batch, so will need a new translation engine created anyway.


src/Serval/test/Serval.E2ETests/ServalApiTests.cs line 504 at r1 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

See comment above.

Done.


src/Serval/test/Serval.E2ETests/ServalClientHelper.cs line 593 at r1 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

This logic always felt a little awkward to me. At a minimum, since this is used a few times, could we move it to a method?

I've removed the logic outright, as I don't think it is helpful to have echo configured differently (beyond languages code of course), and removing it doesn't affect the test results


src/Serval/test/Serval.E2ETests/data/TestProject/grc.ldml line at r1 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

Do we need to check this in?

Done. No - this is a deletion from the old test project.


src/Serval/test/Serval.E2ETests/data/en_usfm/en.ldml line 1 at r1 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

Do we need to check this in?

Done. I have removed this.


src/Serval/test/Serval.E2ETests/data/en_usfm/ProjectProgress.xml line 1 at r1 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

Same with these: Can we get away with not adding these to the repo?

Done. I have removed this.


src/Serval/test/Serval.E2ETests/data/sbp_usfm/sbp.ldml line at r1 (raw file):

Previously, Enkidu93 (Eli C. Lowry) wrote…

Do we need to check this in for our purposes? Or can we just remove it?

Done. I have removed this.

@codecov-commenter
Copy link

codecov-commenter commented Nov 5, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 66.06%. Comparing base (fd5a372) to head (07193fb).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #812   +/-   ##
=======================================
  Coverage   66.06%   66.06%           
=======================================
  Files         371      371           
  Lines       20068    20068           
  Branches     2627     2627           
=======================================
  Hits        13258    13258           
  Misses       5873     5873           
  Partials      937      937           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

@ddaspit reviewed 40 of 52 files at r1, 12 of 12 files at r2, all commit messages.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @Enkidu93)

Copy link
Collaborator

@Enkidu93 Enkidu93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

@Enkidu93 reviewed 12 of 12 files at r2, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @pmachapman)


src/Serval/test/Serval.E2ETests/ServalApiTests.cs line 225 at r1 (raw file):

Previously, pmachapman (Peter Chapman) wrote…

I think this test is too different, as it uses a quite different training corpus, and it persists the model, unlike Nmt_Batch, so will need a new translation engine created anyway.

OK. If the main purpose of this test is to test the model downloading, could we not just persist the model in Nmt_Batch and then retrieve it at the end? If you think the set up here is different enough to constitute a separate test, I'm fine with that too 👍. I guess this does push the boundaries on large files - although you could always work the large file into the other test.


src/Serval/test/Serval.E2ETests/ServalClientHelper.cs line 593 at r1 (raw file):

Previously, pmachapman (Peter Chapman) wrote…

I've removed the logic outright, as I don't think it is helpful to have echo configured differently (beyond languages code of course), and removing it doesn't affect the test results

🎉 Great, thank you!

Copy link
Collaborator Author

@pmachapman pmachapman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pmachapman dismissed @Enkidu93 from a discussion.
Reviewable status: all files reviewed (commit messages unreviewed), all discussions resolved (waiting on @pmachapman)

Copy link
Collaborator Author

@pmachapman pmachapman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pmachapman reviewed all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @pmachapman)

@pmachapman pmachapman merged commit 5d8e83a into main Nov 6, 2025
3 checks passed
@pmachapman pmachapman deleted the refactor_tests branch November 6, 2025 20:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Don't use IsEqualTo for AlignedWordPairs in GetEchoWordAlignment E2E test Refactor ServalClientHelper and ServalApiTests

5 participants